Multi-Font Farsi/Arabic Isolated Character Recognition Using Chain Codes
Abstract
OCR systems now have many applications and are increasingly used in daily life. Much research has addressed the recognition of Latin, Japanese, and Chinese characters, but comparatively little has addressed the recognition of Farsi/Arabic characters. The likely reasons are the greater difficulty and complexity of recognizing these characters and the limited IT activity in Farsi- and Arabic-speaking countries. This paper presents a technique for recognizing isolated Farsi/Arabic characters. It combines a chain-code-based algorithm with other distinguishing features: the number and location of dots and auxiliary parts, and the number of holes in the isolated character. Experimental results show relatively high accuracy when the method is tested on several standard Farsi fonts.
Keywords: Farsi characters, OCR, feature extraction, chain code.
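The abstract names two features, a chain code and the number of holes, without giving the authors' exact procedure. The Python sketch below is a minimal, generic illustration of both, not the paper's implementation. It assumes the contour is already available as an ordered list of 8-connected (row, col) pixels and that the character is a binary grid with 1 for ink. The names `freeman_chain_code`, `count_holes`, and `DIRECTIONS` are hypothetical.

```python
from collections import deque

# Freeman 8-direction codes keyed by (d_row, d_col); rows grow downward,
# so code 2 ("north") means the row index decreases.
DIRECTIONS = {
    (0, 1): 0, (-1, 1): 1, (-1, 0): 2, (-1, -1): 3,
    (0, -1): 4, (1, -1): 5, (1, 0): 6, (1, 1): 7,
}


def freeman_chain_code(contour):
    """Return the chain code of an ordered list of 8-connected (row, col) pixels."""
    codes = []
    for (r0, c0), (r1, c1) in zip(contour, contour[1:]):
        step = (r1 - r0, c1 - c0)
        if step not in DIRECTIONS:
            raise ValueError(f"{(r0, c0)} and {(r1, c1)} are not 8-neighbours")
        codes.append(DIRECTIONS[step])
    return codes


def count_holes(image):
    """Count background regions fully enclosed by foreground (1) pixels."""
    rows, cols = len(image), len(image[0])
    # Pad with a background border so all outer background forms one region.
    padded = [[0] * (cols + 2)]
    padded += [[0] + list(row) + [0] for row in image]
    padded += [[0] * (cols + 2)]
    seen = [[False] * (cols + 2) for _ in range(rows + 2)]
    regions = 0
    for r in range(rows + 2):
        for c in range(cols + 2):
            if padded[r][c] != 0 or seen[r][c]:
                continue
            regions += 1
            seen[r][c] = True
            queue = deque([(r, c)])
            # Flood-fill the background with 4-connectivity, the usual
            # counterpart to treating the ink as 8-connected.
            while queue:
                y, x = queue.popleft()
                for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    ny, nx = y + dy, x + dx
                    if (0 <= ny < rows + 2 and 0 <= nx < cols + 2
                            and padded[ny][nx] == 0 and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
    # One region is the outer background; every other region is a hole.
    return regions - 1


square = [(0, 0), (0, 1), (1, 1), (1, 0), (0, 0)]
ring = [[1, 1, 1],
        [1, 0, 1],
        [1, 1, 1]]
print(freeman_chain_code(square))
print(count_holes(ring))
```

In a recognizer of this kind, such features would typically be combined with dot counts and positions into a single feature vector per character; how the paper weights or matches them is not stated in the abstract.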
Similar resources
A New Method to Improve Multi Font Farsi/Arabic Character Segmentation Results: Using Extra Classes of Some Character Combinations
A new segmentation algorithm for multifont Farsi/Arabic texts based on conditional labeling of up and down contours was presented in [1]. A preprocessing technique was used to adjust the local base line for each subword. Adaptive base line, up and down contours and their curvatures were used to improve the segmentation results. The algorithm segments 97% of 22236 characters in 18 fonts correctl...
Hybrid of Rough Neural Networks for Arabic/Farsi Handwriting Recognition
Handwritten character recognition is one of the most active research areas in the field of pattern recognition. In this paper, a hybrid model of rough neural network has been developed for recognizing isolated Arabic/Farsi digital characters. It addresses two neural network problems, proneness to overfitting and the empirical nature of model development, using rough sets and the dissimilarity analy...
A Comprehensive Isolated Farsi/Arabic Character Database for Handwritten OCR Research
This paper presents a new comprehensive database of isolated offline handwritten Farsi/Arabic numbers and characters for use in optical character recognition research. The database is freely available for academic use. So far, no such freely available database exists for the Farsi language. Grayscale images of 52,380 characters and 17,740 numerals are included. Each image was scanned from Iranian scho...
Application of Fractal Codes in Recognition of Isolated Handwritten Farsi/Arabic Characters and Numerals
In this paper, we propose a new method for recognizing isolated handwritten Farsi/Arabic characters and numerals using fractal codes. Fractal codes represent affine transformations that, when applied iteratively to the range-domain pairs in an arbitrary initial image, yield a result close to the given image. Each fractal code consists of six parameters such as corresponding domain coordinates fo...
A study on font-family and font-size recognition applied to Arabic word images at ultra-low resolution
In this paper, we propose a new font and size identification method for ultra-low resolution Arabic word images using a stochastic approach. The literature has shown how difficult it is for Arabic text recognition systems to handle multi-font and multi-size word images. This is due to the variability induced by some font families, in addition to the inherent difficulties of Arabic writing including cu...